On Regression-Tree-Based Synthetic Data Methods for Business Data
Abstract
The challenge of balancing the competing objectives of allowing statistical analysis of confidential data and maintaining confidentiality is of great interest to national statistical agencies and other data custodians seeking to make their data available for research. This balance is often characterised as a trade-off between disclosure risk and data utility, where disclosure risk attempts to capture the probability that a data release results in a disclosure, while data utility attempts to capture some measure of the usefulness of the released data (see [6]). To date, most of the literature addressing this balance has focussed on data about individuals; however, the same problem arises in the context of data about businesses and enterprises. The purpose of this paper is to provide an empirical evaluation of existing methodology, developed for data about individuals, when it is applied to business data.
Similar resources
Forest Stand Types Classification Using Tree-Based Algorithms and SPOT-HRG Data
Forest type mapping is one of the most necessary elements in forest management and silviculture treatments. Traditional methods such as field surveys are time-consuming and cost-intensive. Improvements in remote sensing data sources and classification-estimation methods are creating new opportunities for obtaining more accurate maps of forest biophysical attributes. This research co...
Modelling Customer Attraction Prediction in Customer Relation Management using Decision Tree: A Data Mining Approach
In today’s quality-based competitive world, known as the knowledge age, customer attraction is of utmost importance. In keeping with the slogan “customer is always right”, customer relation management is the core of organizational strategy, playing an important role in four aspects: customer identification, customer attraction, customer retention, and customer satisfaction. Commercial organiza...
The application of data mining techniques in manipulated financial statement classification: The case of Turkey
Predicting financially false statements to detect frauds in companies has an increasing trend in recent studies. The manipulations in financial statements can be discovered by auditors when related financial records and indicators are analyzed in depth together with the experience of auditors in order to create knowledge to develop a decision support system to classify firms. Auditors may annot...
A hybrid model based on machine learning and genetic algorithm for detecting fraud in financial statements
Financial statement fraud has increasingly become a serious problem for business, government, and investors. In fact, this threatens the reliability of capital markets, corporate heads, and even the audit profession. Auditors in particular face their apparent inability to detect large-scale fraud, and there are various ways to identify this problem. In order to identify this problem, the majori...
An Effective Tree-Based Algorithm for Ordinal Regression
Recently ordinal regression has attracted much interest in machine learning. The goal of ordinal regression is to assign each instance a rank, which should be as close as possible to its true rank. We propose an effective tree-based algorithm, called Ranking Tree, for ordinal regression. The main advantage of Ranking Tree is that it can group samples with closer ranks together in the process of...
Publication date: 2013